CloudTrail Enrichment with CloudWatch Logs Transformation
Introduction
AWS CloudTrail provides comprehensive audit coverage of AWS API activity, creating a security and compliance foundation for organizations. When these logs are delivered to Amazon CloudWatch Logs, CloudWatch Logs Transformation enables organizations to enrich and optimize CloudTrail data without custom Lambda functions, external ETL pipelines, or post-processing scripts.
Using declarative JSON processor configurations, you can parse nested fields, add security context, classify resources, and optimize data for downstream delivery as CloudTrail events flow into CloudWatch Logs. This guide demonstrates practical transformation patterns for security monitoring, compliance reporting, and operational efficiency while maintaining the simplicity and reliability of AWS-native log management.
Why This Matters
Organizations delivering CloudTrail logs to CloudWatch Logs often need to enhance this data to align with specific operational workflows and tooling requirements:
- Security teams want to add custom risk indicators and classification tags to accelerate threat detection workflows
- Compliance teams need to pre-classify events by regulatory framework (PCI-DSS, HIPAA, SOC2) to streamline audit responses
- Operations teams managing multi-account environments want to add business context like environment labels, cost centers, or team ownership to CloudTrail's technical event data
- All teams forwarding data to downstream systems (SIEMs, OpenSearch, S3) want to optimize data structure—flattening nested fields for tool compatibility or focusing on security-relevant fields to reduce downstream ingestion costs
Without native transformation capabilities, teams resort to building custom Lambda functions, maintaining external ETL pipelines, or performing post-processing—adding complexity, latency, and operational overhead to their log management infrastructure.
How CloudWatch Logs and Transformation Work
CloudWatch Logs
Amazon CloudWatch Logs serves as an audit log destination for CloudTrail. When CloudTrail delivers logs to CloudWatch Logs, each API event becomes a log event organized within log groups and log streams, enabling organizations to:
- Query recent API activity using CloudWatch Logs Insights
- Create security alerts with metric filters and alarms
- Forward logs to downstream systems using subscription filters
CloudWatch Logs Transformation
CloudWatch Logs Transformation enables modification of log data during ingestion using declarative processors. Transformations are defined as JSON configurations that specify operations like:
- parseJSON: Parse JSON structures and extract nested fields
- copyValue: Copy values to new fields for enrichment
- substituteString: Perform pattern-based string replacements
- deleteKeys: Remove unnecessary fields
When applied to a log group, transformations execute automatically on every incoming log event before storage. Both the original and transformed versions are retained in CloudWatch Logs, with subscription filters forwarding the transformed data to downstream systems and CloudWatch Logs Insights queries displaying the transformed version for analysis. Note that the GetLogEvents and FilterLogEvents APIs return the original log version, not the transformed version.
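As a concrete illustration, a transformer can be applied to a log group with a short script. The following is a minimal boto3 sketch, assuming a hypothetical log group name; the processors mirror the JSON configurations used throughout this guide:
import boto3

logs = boto3.client("logs")

# Apply a transformer to the CloudTrail log group; processors run in order
# on every incoming event before storage.
logs.put_transformer(
    logGroupIdentifier="aws-cloudtrail-logs-example",  # hypothetical log group
    transformerConfig=[
        # Parse the raw CloudTrail JSON in @message into addressable fields
        {"parseJSON": {"source": "@message"}},
        # Copy a nested field to a flat, top-level name
        {
            "copyValue": {
                "entries": [
                    {
                        "source": "userIdentity.type",
                        "target": "user_type",
                        "overwriteIfExists": True,
                    }
                ]
            }
        },
    ],
)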
The Solution
CloudWatch Logs Transformation addresses these challenges by providing native, real-time enrichment capabilities that eliminate custom infrastructure while delivering immediate operational value. The following sections show how organizations can apply transformations across four key areas:
Security Monitoring
Organizations can streamline threat detection by adding enriched fields to CloudTrail's comprehensive event data:
- Instant threat detection: Add is_root_user flags for immediate filtering (see Use Case #4: Root User Activity Detection)
- Resource sensitivity tagging: Automatically classify S3 buckets based on naming patterns (see Use Case #1: S3 Data Classification)
- Simplified alerting: Create CloudWatch alarms using metric filters on enriched fields without complex JSON parsing
- SIEM-ready data: Flatten nested fields for seamless integration with security tools (see Use Case #2: Flattening Nested Fields)
Optimized Data Delivery
CloudTrail data events provide comprehensive audit coverage, generating millions of logs daily. Organizations can optimize this data for specific downstream systems:
- Streamlined downstream delivery: Remove verbose fields before sending to S3, OpenSearch, or third-party SIEMs via subscription filters (see Use Case #3: Optimized Downstream Delivery)
- Selective field retention: Keep only security-critical data while discarding operational noise
- Improved query performance: Smaller, flattened log structures mean faster CloudWatch Logs Insights queries
- Reduced downstream costs: Send only relevant data to external systems, reducing their ingestion and storage costs
Note: Both original and transformed logs are stored in CloudWatch Logs. The primary benefit is optimizing data sent to downstream systems via subscription filters, not reducing CloudWatch Logs storage costs.
Operational Efficiency
Organizations with dozens or hundreds of AWS accounts can streamline correlation of CloudTrail events across environments:
- Environment tagging: Automatically label events as production, staging, or development based on account ID (see Use Case #5: Multi-Account Environment Tagging)
- Standardized field names: Flatten nested fields like userIdentity.type and sourceIPAddress for consistent querying across all accounts (see Use Case #2: Flattening Nested Fields)
- Business context: Add compliance framework tags at ingestion time (see Use Case #6: Compliance Framework Tagging)
- Simplified cross-account analysis: Query all accounts using consistent field names in CloudWatch Logs Insights
Compliance and Audit Readiness
Organizations can accelerate audit responses by pre-classifying CloudTrail events:
- Compliance framework tagging: Automatically tag PCI-DSS, HIPAA, or SOC2-relevant events (see Use Case #6: Compliance Framework Tagging)
- Root user monitoring: Flag root user activity for compliance audits (see Use Case #4: Root User Activity Detection)
- Retention optimization: Separate critical audit data from operational logs for different retention policies
- Faster audit responses: Pre-classified logs enable instant filtering during compliance reviews
Common Use Cases and Solutions
The following examples demonstrate practical transformation patterns for CloudTrail logs. Each use case includes a specific challenge, the processor configuration to address it, and the resulting benefits. These patterns can be combined or adapted to meet your organization's specific security monitoring and operational requirements.
1. S3 Data Classification for Sensitive Resource Identification
Challenge: Security teams struggle to quickly identify which CloudTrail events involve sensitive or production S3 buckets without manually inspecting each ARN.
Solution: Automatically classify S3 resources based on bucket naming patterns.
[
{
"parseJSON": {
"source": "@message"
}
},
{
"copyValue": {
"entries": [
{
"source": "resources.0.ARN",
"target": "data_classification"
}
]
}
},
{
"substituteString": {
"entries": [
{
"source": "data_classification",
"from": ".*-prod-.*",
"to": "sensitive"
},
{
"source": "data_classification",
"from": "^arn:aws:s3:::.*",
"to": "normal"
}
]
}
}
]
Benefit: Security analysts can filter on the data_classification field to instantly identify access to sensitive resources.
Query Example:
fields @timestamp, eventName, userIdentity.arn, data_classification
| filter data_classification = "sensitive"
| sort @timestamp desc
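Before deploying this configuration, it can be checked against a synthetic event with the TestTransformer API. A sketch follows, with an illustrative bucket name; because the substitutions run in order, the -prod- pattern matches first and the event comes back tagged sensitive:
import json

import boto3

logs = boto3.client("logs")

sample_event = json.dumps({
    "eventName": "GetObject",
    "resources": [{"ARN": "arn:aws:s3:::billing-prod-data"}],  # illustrative bucket
})

response = logs.test_transformer(
    transformerConfig=[
        {"parseJSON": {"source": "@message"}},
        {"copyValue": {"entries": [
            {"source": "resources.0.ARN", "target": "data_classification"}
        ]}},
        {"substituteString": {"entries": [
            {"source": "data_classification", "from": ".*-prod-.*", "to": "sensitive"},
            {"source": "data_classification", "from": "^arn:aws:s3:::.*", "to": "normal"},
        ]}},
    ],
    logEventMessages=[sample_event],
)

# Expect data_classification == "sensitive" for the -prod- bucket
print(response["transformedLogs"][0]["transformedEventMessage"])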
2. Flattening Nested Fields for SIEM Integration
Challenge: SIEM tools typically require flat field structures, but CloudTrail's deeply nested JSON forces complex field mappings and parsing in correlation rules.
Solution: Extract and flatten commonly queried nested fields.
[
{
"parseJSON": {
"source": "@message"
}
},
{
"copyValue": {
"entries": [
{
"source": "userIdentity.type",
"target": "user_type",
"overwriteIfExists": true
},
{
"source": "sourceIPAddress",
"target": "source_ip",
"overwriteIfExists": true
},
{
"source": "awsRegion",
"target": "region",
"overwriteIfExists": true
}
]
}
}
]
Benefit: Standardized field names across all accounts simplify SIEM correlation rules and reduce configuration complexity.
Query Example:
fields @timestamp, eventName, user_type, source_ip, region
| filter region = "us-east-1"
| sort @timestamp desc
3. Optimized Downstream Delivery Through Field Reduction
Challenge: CloudTrail data events generate massive volumes, and forwarding every verbose field to downstream systems inflates their ingestion and storage costs.
Solution: Remove fields before forwarding via subscription filters.
[
{
"parseJSON": {
"source": "@message"
}
},
{
"deleteKeys": {
"withKeys": [
"responseElements",
"requestParameters"
]
}
}
]
Benefit: Reduces data volume sent to downstream systems (S3, OpenSearch, SIEMs), lowering their ingestion and storage costs while maintaining all security-relevant data.
Important: Both original and transformed logs are stored in CloudWatch Logs. Subscription filters forward the transformed version, enabling cost savings in downstream systems. Only delete fields not required for your security monitoring. The example above removes verbose fields (responseElements and requestParameters) but retains core audit data like eventName, userIdentity, sourceIPAddress, and eventTime. Note that deleteKeys only deletes fields that exist in the event; if a field doesn't exist, it is silently skipped. Add fields like additionalEventData, resources, or serviceEventDetails to the list based on your specific requirements.
Query Example:
fields @timestamp, eventName, userIdentity.type, sourceIPAddress
| filter eventName like /Put|Delete|Create/
| sort @timestamp desc
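These savings only materialize once a subscription filter forwards the transformed events downstream. A minimal boto3 sketch follows; the log group name, delivery stream, and role ARNs are placeholders:
import boto3

logs = boto3.client("logs")

# Attach a subscription filter so the trimmed, transformed events flow to a
# downstream destination (here, a Firehose delivery stream feeding S3).
logs.put_subscription_filter(
    logGroupName="aws-cloudtrail-logs-example",  # hypothetical log group
    filterName="forward-trimmed-cloudtrail",
    filterPattern="",  # empty pattern forwards every event
    destinationArn="arn:aws:firehose:us-east-1:111122223333:deliverystream/to-s3",
    roleArn="arn:aws:iam::111122223333:role/CWLtoFirehoseRole",
)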
4. Root User Activity Detection
Challenge: Identifying root user activity requires parsing the nested userIdentity.type field in every query and alert, which complicates alarm creation.
Solution: Add explicit boolean flag for root user detection.
[
{
"parseJSON": {
"source": "@message"
}
},
{
"copyValue": {
"entries": [
{
"source": "userIdentity.type",
"target": "is_root_user",
"overwriteIfExists": true
}
]
}
},
{
"substituteString": {
"entries": [
{
"source": "is_root_user",
"from": "Root",
"to": "true"
},
{
"source": "is_root_user",
"from": "(IAMUser|AssumedRole|FederatedUser|AWSAccount|AWSService)",
"to": "false"
}
]
}
}
]
Benefit: Enables simple filtering for root user activity with filter is_root_user = "true".
Query Example:
fields @timestamp, eventName, userIdentity.arn, sourceIPAddress, is_root_user
| filter is_root_user = "true"
| sort @timestamp desc
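Since is_root_user is now a flat string field, alerting reduces to a metric filter plus an alarm. The boto3 sketch below assumes the same hypothetical log group and uses illustrative metric names:
import boto3

logs = boto3.client("logs")
cloudwatch = boto3.client("cloudwatch")

# Count every event whose enriched flag marks root user activity.
logs.put_metric_filter(
    logGroupName="aws-cloudtrail-logs-example",  # hypothetical log group
    filterName="root-user-activity",
    filterPattern='{ $.is_root_user = "true" }',
    metricTransformations=[{
        "metricName": "RootUserActivity",
        "metricNamespace": "Security/CloudTrail",  # illustrative namespace
        "metricValue": "1",
    }],
)

# Alarm on any root user activity within a five-minute window.
cloudwatch.put_metric_alarm(
    AlarmName="root-user-activity",
    Namespace="Security/CloudTrail",
    MetricName="RootUserActivity",
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=0,
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="notBreaching",
)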
5. Multi-Account Environment Tagging
Challenge: Organizations with multiple AWS accounts need to quickly identify which environment (prod/staging/dev) generated each CloudTrail event.
Solution: Map account IDs to environment labels.
[
{
"parseJSON": {
"source": "@message"
}
},
{
"copyValue": {
"entries": [
{
"source": "recipientAccountId",
"target": "environment",
"overwriteIfExists": true
}
]
}
},
{
"substituteString": {
"entries": [
{
"source": "environment",
"from": "111122223333",
"to": "production"
}
]
}
},
{
"substituteString": {
"entries": [
{
"source": "environment",
"from": "444455556666",
"to": "staging"
}
]
}
},
{
"substituteString": {
"entries": [
{
"source": "environment",
"from": "[0-9]{12}",
"to": "development"
}
]
}
}
]
Benefit: Enables environment-based filtering and alerting without maintaining account ID mappings in downstream systems. Processor order matters here: the specific account IDs are substituted first, so the catch-all [0-9]{12} pattern only relabels the accounts that remain as development.
Query Example:
fields @timestamp, eventName, userIdentity.arn, environment
| filter environment = "production"
| stats count() by eventName
| sort count desc
6. Compliance Framework Tagging
Challenge: Compliance teams need to quickly filter CloudTrail events relevant to specific regulatory frameworks (PCI-DSS, HIPAA, SOC2) during audits.
Solution: Automatically tag events based on compliance-relevant patterns.
Note: The following shows how to add tags related to compliance frameworks. The eventName mappings in the example below are illustrative and don't correspond to any specific framework's requirements.
[
{
"parseJSON": {
"source": "@message"
}
},
{
"copyValue": {
"entries": [
{
"source": "eventName",
"target": "compliance_framework",
"overwriteIfExists": true
}
]
}
},
{
"substituteString": {
"entries": [
{
"source": "compliance_framework",
"from": ".*(CreateKey|DeleteKey|DisableKey|ScheduleKeyDeletion|PutKeyPolicy).*",
"to": "PCI-DSS,HIPAA"
}
]
}
},
{
"substituteString": {
"entries": [
{
"source": "compliance_framework",
"from": ".*(CreateAccessKey|DeleteAccessKey|UpdateAccessKey|CreateUser|DeleteUser).*",
"to": "SOC2,PCI-DSS"
}
]
}
},
{
"substituteString": {
"entries": [
{
"source": "compliance_framework",
"from": ".*(PutBucketEncryption|DeleteBucketEncryption|PutBucketPolicy|DeleteBucketPolicy).*",
"to": "HIPAA,PCI-DSS"
}
]
}
}
]
Benefit: Enables instant filtering of compliance-relevant events during audits without maintaining separate event catalogs.
Query Example:
fields @timestamp, eventName, userIdentity.arn, compliance_framework
| filter compliance_framework like /PCI-DSS/
| sort @timestamp desc
Best Practices
Successful CloudWatch Logs Transformation implementations require careful planning and ongoing maintenance. These best practices cover design principles, performance optimization, security considerations, and cost management to help you build reliable and efficient transformation pipelines.
Design Principles
- Start Simple: Begin with basic transformations and add complexity as needed
- Test Thoroughly: Validate transformations with sample CloudTrail events before production deployment
- Document Patterns: Maintain documentation of regex patterns and their intended matches
- Version Control: Track transformation configurations in source control for change management
Performance Optimization
- Minimize Processor Count: Use fewer, well-designed processors rather than many small ones
- Minimize Regex Complexity: Use simple patterns when possible to improve performance
- Limit Field Operations: Only copy or transform fields necessary for downstream analysis
- Test at Scale: Validate transformation performance with realistic log volumes
Security Considerations
- Avoid PII Exposure: Never add PII to custom fields without proper data handling controls
- Validate Patterns: Ensure regex patterns don't inadvertently expose sensitive data
- Audit Transformations: Regularly review transformation logic for security implications
- Preserve Audit Integrity: Ensure transformations don't remove fields required for compliance or forensic analysis
Cost Management
- Optimize Downstream Delivery: Remove unnecessary fields before forwarding to external systems via subscription filters to reduce downstream ingestion costs
- Balance Storage vs Query Performance: Consider trade-offs between storing additional enriched fields and query complexity
- Monitor Transformation Metrics: Track CloudWatch Logs metrics for transformation errors and performance
- Review Regularly: Periodically assess whether transformations still align with current requirements
Querying Original vs Transformed Logs
When transformations are applied to a log group, both the original and transformed versions are stored in CloudWatch Logs. Understanding how to access each version is important for validation and troubleshooting.
CloudWatch Logs Insights Behavior
- Default: CloudWatch Logs Insights queries display the transformed version of logs
- Original Access: The original log content is always available in the @message field
- API Behavior: The GetLogEvents and FilterLogEvents APIs return the original log version
Query Examples
Query transformed logs (default behavior):
fields @timestamp, eventName, user_type, source_ip, region
| filter region = "us-east-1"
| sort @timestamp desc
Query original logs using @message:
fields @timestamp, @message
| parse @message /"eventName":"(?<original_eventName>[^"]+)"/
| filter original_eventName like /Create/
| sort @timestamp desc
Compare original and transformed side-by-side:
fields @timestamp, @message as original_log, eventName, user_type, region
| limit 10
This dual-storage approach ensures you can always access the original audit trail while benefiting from enriched, transformed data for day-to-day operations.
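To spot-check that the raw audit trail is intact, FilterLogEvents can be called directly, since it returns the original (untransformed) events as noted above. A brief sketch with a hypothetical log group name:
import boto3

logs = boto3.client("logs")

page = logs.filter_log_events(
    logGroupName="aws-cloudtrail-logs-example",  # hypothetical log group
    limit=5,
)
for event in page["events"]:
    print(event["message"])  # original CloudTrail JSON, not the transformed copy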
Implementation Steps
- Identify Requirements: Determine which CloudTrail fields need enrichment or modification
- Design Transformation Logic: Map out the processor chain and expected outcomes
- Create Test Events: Generate sample CloudTrail events for validation
- Configure Transformation: Apply the processor configuration to your log group
- Validate Results: Query transformed logs using CloudWatch Logs Insights to verify correct processing (see the sketch after this list)
- Monitor and Iterate: Continuously improve transformations based on operational feedback
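The testing, configuration, and validation steps can be scripted end to end: test against a sample event, apply the transformer only when the output looks correct, then read the configuration back. A sketch under the same hypothetical naming assumptions:
import json

import boto3

logs = boto3.client("logs")
LOG_GROUP = "aws-cloudtrail-logs-example"  # hypothetical log group

config = [
    {"parseJSON": {"source": "@message"}},
    {"copyValue": {"entries": [
        {"source": "awsRegion", "target": "region", "overwriteIfExists": True}
    ]}},
]

# Create a test event and validate the configuration offline.
sample = json.dumps({"eventName": "ConsoleLogin", "awsRegion": "us-east-1"})
result = logs.test_transformer(transformerConfig=config, logEventMessages=[sample])
transformed = json.loads(result["transformedLogs"][0]["transformedEventMessage"])

# Apply only if the flattened field came through, then confirm deployment.
if transformed.get("region") == "us-east-1":
    logs.put_transformer(logGroupIdentifier=LOG_GROUP, transformerConfig=config)
    print(logs.get_transformer(logGroupIdentifier=LOG_GROUP))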
Conclusion
CloudWatch Logs Transformation enables organizations to maximize the value of CloudTrail data delivered to CloudWatch Logs by enriching events at ingestion time with security context, flattening complex JSON structures, and optimizing downstream delivery—all through native AWS capabilities. Security and operations teams can transform their CloudTrail events into actionable intelligence without the operational overhead of custom processing infrastructure. This guide provides the patterns, best practices, and implementation strategies needed to unlock these capabilities, enabling simplified compliance reporting and reduced downstream costs while maintaining complete audit trails for your AWS environment.